{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### CycleGAN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bài báo: [Link](https://arxiv.org/abs/1703.10593\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Chúng ta có thể xây dựng một ứng dụng để chuyển style một bức ảnh bình thường sang một bức ảnh tương tự với phong cách vẽ của Vangogh\n",
    "\n",
    "Bài viết này được tham khảo từ repo của bạn **vanhuyz** về [CycleGAN](https://github.com/vanhuyz/CycleGAN-TensorFlow)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1. Giới thiệu Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Kết quả test:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"images/styletransfer.jpeg\" />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2. Download Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "DATASET_URL='https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/vangogh2photo.zip'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "!rm -rf data\n",
    "!mkdir data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "TARGET_FILE='./data/vangogh2photo.zip'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING: timestamping does nothing in combination with -O. See the manual\n",
      "for details.\n",
      "\n",
      "--2019-08-06 11:52:56--  https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/vangogh2photo.zip\n",
      "Resolving people.eecs.berkeley.edu (people.eecs.berkeley.edu)... 128.32.189.73\n",
      "Connecting to people.eecs.berkeley.edu (people.eecs.berkeley.edu)|128.32.189.73|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 306590349 (292M) [application/zip]\n",
      "Saving to: ‘./data/vangogh2photo.zip’\n",
      "\n",
      "./data/vangogh2phot 100%[===================>] 292.39M  3.38MB/s    in 99s     \n",
      "\n",
      "2019-08-06 11:54:36 (2.95 MB/s) - ‘./data/vangogh2photo.zip’ saved [306590349/306590349]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!wget -N $DATASET_URL -O $TARGET_FILE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "TARGET_DIR='./data/vangogh2photo/'\n",
    "!unzip $TARGET_FILE -d ./data/\n",
    "!rm $TARGET_FILE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tree folder lưu trữ data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. Load những thư viện cần thiết\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import random\n",
    "import os\n",
    "try:\n",
    "    from os import scandir\n",
    "except ImportError:\n",
    "    from scandir import scandir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4. Đọc data và ghi thành tf records"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sử dụng thư viện scandir để lấy đường dẫn của tất cả ảnh trong một thư mục\n",
    "# Đọc thêm về thư viện này ở đây: https://pypi.org/project/scandir/\n",
    "\n",
    "def read_data(target_dir):\n",
    "    \"\"\"\n",
    "    Read images in target dir \n",
    "    Args: \n",
    "        target_dir: string, path to the target directory\n",
    "    Returns:\n",
    "        images: list, list of images paths\n",
    "    \"\"\"\n",
    "    \n",
    "    images = []\n",
    "    for image in scandir(target_dir):\n",
    "        if image.name.endswith('.jpg') and image.is_file():\n",
    "            images.append(image.path)\n",
    "    # Shuffle images index\n",
    "    # Why random.seed()\n",
    "    # Seeding a pseudo-random number generator gives it its first \"previous\" value. \n",
    "    # Each seed value will correspond to a sequence of generated values for a given random number generator. \n",
    "    # That is, if you provide the same seed twice, you get the same sequence of numbers twice.\n",
    "    random.seed(12345)\n",
    "    all_indexs = list(range(len(images)))\n",
    "    random.shuffle(all_indexs)\n",
    "    \n",
    "    # Shuffle images paths in images list based on shuffed indexes\n",
    "    shuffed_images = []\n",
    "    for i in all_indexs:\n",
    "        shuffed_images.append(images[all_indexs[i]])\n",
    "    return shuffed_images\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sử dụng tf.Example để thực hiện đọc record nhanh hơn (vì có sử dụng caching khi xử lý data). Xem thêm tại [đây](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/load_data/tf_records.ipynb#scrollTo=3pkUd_9IZCFO)\n",
    "\n",
    "Bản chất tf.Example là một cái mapping {\"string\": tf.train.Feature}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Một số hàm convert dữ liệu thành tf.train.Feature"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The following functions can be used to convert a value to a type compatible\n",
    "# with tf.Example.\n",
    "\"\"\"\n",
    "print(_bytes_feature(b'test_string'))\n",
    "\n",
    "bytes_list {\n",
    "  value: \"test_string\"\n",
    "}\n",
    "\n",
    "\"\"\"\n",
    "\n",
    "def _bytes_feature(value):\n",
    "  \"\"\"Returns a bytes_list from a string / byte.\n",
    "  \"\"\"\n",
    "  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))\n",
    "\n",
    "def _float_feature(value):\n",
    "  \"\"\"Returns a float_list from a float / double.\"\"\"\n",
    "  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))\n",
    "\n",
    "def _int64_feature(value):\n",
    "  \"\"\"Returns an int64_list from a bool / enum / int / uint.\"\"\"\n",
    "  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "def convert_data_to_tf_record(target_dir, tfrecord_file):\n",
    "    \"\"\"\n",
    "    Convert Data to TFRecord\n",
    "    \"\"\"\n",
    "    # Create tfrecords folder name if it does not exists\n",
    "    \n",
    "    tf_records_dir = tfrecord_file.split('/')[-2]\n",
    "    \n",
    "    tf_records_dir = os.path.dirname(tfrecord_file)\n",
    "    \n",
    "    if not os.path.exists(tf_records_dir):\n",
    "        try:\n",
    "            print('Creating tfrecords folder')\n",
    "            os.makedirs(tf_records_dir)\n",
    "        except:\n",
    "            print('Failed to create tfrecords folder')\n",
    "\n",
    "\n",
    "    images = read_data(target_dir)\n",
    "        \n",
    "    writer = tf.python_io.TFRecordWriter(tfrecord_file)\n",
    "    \n",
    "    for i in range(len(images)):\n",
    "        image_path = images[i]\n",
    "        image_name = image_path.split('/')[-1]\n",
    "        with tf.gfile.FastGFile(image_path, 'rb') as f:\n",
    "            image_data = f.read()\n",
    "            \n",
    "        # create example\n",
    "        feature = {\n",
    "            'image_name': _bytes_feature(tf.compat.as_bytes(os.path.basename(image_name))),\n",
    "            'image_data': _bytes_feature((image_data))\n",
    "        }\n",
    "            \n",
    "        example = tf.train.Example(features=tf.train.Features(feature=feature))\n",
    "        writer.write(example.SerializeToString())\n",
    "    \n",
    "    print('Finished.')\n",
    "    writer.close()\n",
    "        \n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Converting X data to tf record\n",
      "Finished.\n",
      "Converting Y data to tf record\n",
      "Finished.\n"
     ]
    }
   ],
   "source": [
    "x_dir = './data/vangogh2photo/trainA'\n",
    "y_dir = './data/vangogh2photo/trainB'\n",
    "x_tfrecord_file = './data/vangogh2photo/tfrecords/vangogh.tfrecords'\n",
    "y_tfrecord_file = './data/vangogh2photo/tfrecords/photo.tfrecords'\n",
    "# Writing tf records for normal photos\n",
    "print('Converting X data to tf record')\n",
    "convert_data_to_tf_record(x_dir, x_tfrecord_file)\n",
    "# Writing tf records for vangogh photos\n",
    "print('Converting Y data to tf record')\n",
    "convert_data_to_tf_record(y_dir, y_tfrecord_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 5. Khai báo các layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Các ops\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "def batch_norm(input_data):\n",
    "    \"\"\"\n",
    "    Batch Normalization\n",
    "    \"\"\"\n",
    "    return tf.contrib.layers.batch_norm(x, decay=0.9, updates_collections=None, epsilon=1e-5, scale=True, scope=name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Đọc thêm về sự khác nhau giữa Batch Norm và Instance Norm: [tại đây](https://stackoverflow.com/questions/45463778/instance-normalisation-vs-batch-normalisation)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Đọc thêm về [Leaky ReLU](http://cs231n.github.io/neural-networks-1/) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "def instance_norm(input_data):\n",
    "    \"\"\"\n",
    "    Instance Normalization\n",
    "    \"\"\"\n",
    "    \n",
    "    with tf.variable_scope(\"instance_norm\"):\n",
    "        epsilon = 1e-5\n",
    "        depth = input_data.get_shape()[-1]\n",
    "        mean, var = tf.nn.moments(x, [1, 2], keep_dims=True)\n",
    "        scale = tf.get_variable('scale', depth, initializer=tf.truncated_normal_initializer(mean=1.0, stddev=0.02))\n",
    "        offset = tf.get_variable('offset', depth,initializer=tf.constant_initializer(0.0))\n",
    "        normalized_input_data = scale*tf.div(x-mean, tf.sqrt(var+epsilon)) + offset\n",
    "        return normalized_input_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tổng hợp 2 hàm norm lại"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "def norm(input_data, type=\"batch_norm\"):\n",
    "    if type == \"batch_norm\":\n",
    "        return batch_norm(input_data)\n",
    "    elif type == \"instance_norm\":\n",
    "        return instance_norm(input_data)\n",
    "    else:\n",
    "        return input_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\\$\\$ f(x) = \\mathbb{1}(x < 0) (\\alpha x) + \\mathbb{1}(x>=0) (x) \\$\\$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "def lrelu(x, leak=0.3):\n",
    "    return tf.maximum(x, x*leak)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tổng hợp 2 hàm relu"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "def applyRelu(x, type=\"relu\", leak=0.3):\n",
    "    if type == \"relu\":\n",
    "        return tf.nn.relu(x, \"relu\")\n",
    "    elif type == \"lrelu\":\n",
    "        return lrelu(x, leak)\n",
    "    else:\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Cycle GAN model Architecture](https://hardikbansal.github.io/CycleGANBlog/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Standard Deviation\n",
    "stddev = 0.02"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "code_folding": []
   },
   "outputs": [],
   "source": [
    "def conv2d(inputs, num_outputs, kernel_size, stride, stddev=0.02,name=\"conv2d\", padding=\"VALID\", activation_fn=None, norm='instance', isApplyRelu=True, reluType=\"leaky\"):\n",
    "    with tf.variable_scope(name):\n",
    "        conv = tf.contrib.layers.conv2d(inputs, num_outputs, kernel_size, stride, padding, activation_fn=None, weights_initializer=tf.truncated_normal_initializer(stddev=stddev), biases_initializer=tf.constant_initializer(0.0))\n",
    "        conv = norm(conv, \"instance_norm\")\n",
    "        if isApplyRelu:\n",
    "            conv = applyRelu(conv, type=reluType)\n",
    "        return conv\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "def deconv2d(inputs, num_outputs, kernel_size, stride, stddev=0.02, name=\"deconv2d\", padding=\"VALID\", activation_fn=None, norm='instance'):\n",
    "    with tf.variable_scope(name):\n",
    "        conv = tf.contrib.layers.conv2d_transpose(inputs, num_outputs, kernel_size, stride, padding, activation_fn=None, weights_initializer=tf.truncated_normal_initializer(stddev=stddev),biases_initializer=tf.constant_initializer(0.0))\n",
    "        conv = norm(conv, \"instance_norm\")\n",
    "        conv = applyRelu(conv, type=reluType)\n",
    "        return conv\n",
    "    \n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"./images/Resnet.jpg\" width=500/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [],
   "source": [
    "def resnet_block(input_data, name, norm=\"instance_norm\"):\n",
    "    \"\"\"\n",
    "    Resnet Block contains two convolutional layers\n",
    "    Returns:\n",
    "        Output has the same dimension with the input\n",
    "    \"\"\"\n",
    "    with tf.variable_scope(name):\n",
    "        \n",
    "        input_data_shape = input_data.get_shape()\n",
    "\n",
    "        conv1_padded_input = tf.pad(input_data, [[0,0], [1,1], [1,1], [0,0]], 'REFLECT')\n",
    "        conv1 = conv2d(conv1_padded_input, input_data_shape[-1], [3, 3], [1, 1], stddev=0.02, name='{0}_conv1'.format(name), padding='VALID')\n",
    "        conv2_padded_input = tf.pad(conv1, [[0,0], [1,1], [1,1], [0,0]], 'REFLECT')\n",
    "        conv2 = conv2d(conv2_padded_input, input_data_shape[-1], p[3, 3], [1,1], stddev=0.02, padding='VALID', name=\"{}_conv2\".format(name))\n",
    "    \n",
    "    print(tf.nn.relu(input_data + conv2).get_shape())\n",
    "    return tf.nn.relu(input_data + conv2)\n",
    "\n",
    "def n_resnet_blocks(input_data, n=6, norm=\"instance_norm\"):\n",
    "    for i in range(1, n+1):\n",
    "        resnet_output = resnet_block(input_data, \"resnet_{}_\".format(i), norm)\n",
    "        input_data = resnet_output\n",
    "    return resnet_output\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Generator Layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generator nhận vào một bức ảnh có cỡ là (1, width, heigh, 3) và trả về một ảnh có kích thước tương tự"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"./images/Generator.jpg\" />"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Generator():\n",
    "    def __init__(self, name,image_size,ngf=64):\n",
    "        self.name = name\n",
    "        self.ngf = ngf\n",
    "        self.image_size = image_size\n",
    "    \n",
    "    def __call__(input_data):\n",
    "        \"\"\"\n",
    "        Args: \n",
    "            input_data: size = batch_size * input_data_width * input_data_height * channels\n",
    "        Returns:\n",
    "            return output which has the same size with input_data\n",
    "        \"\"\"\n",
    "        with tf.variable_scope(name):\n",
    "            # Encoding\n",
    "            first_filter_size = 7\n",
    "            first_padding_size = 3\n",
    "            paddings = [[0, 0], [first_padding_size, first_padding_size], [first_padding_size, first_padding_size], [0, 0]]\n",
    "            padded_input = tf.pad(input_data, paddings, 'REFLECT')\n",
    "            \n",
    "            print('padded_input', padded_input.get_shape())\n",
    "            \n",
    "            conv1 = conv2d(padded_input, self.ngf, [first_filter_size, first_filter_size], [1, 1], stddev=0.02, name=\"g_conv1\")\n",
    "            # Output shape: (?, w, h , 32)\n",
    "            conv2 = conv2d(conv1, self.ngf*2, [3, 3], [2,2], stddev=0.02, padding='SAME', name=\"g_conv2\")\n",
    "            # Output shape: (?, w/2, h/2, 64)\n",
    "            conv3 = conv2d(conv2, self.ngf*2, [3, 3], [2,2], stddev=0.02, padding='SAME', name=\"g_conv3\")\n",
    "            # Output shape: (?, w/4, h/4, 128)\n",
    "            \n",
    "            # Transformation\n",
    "            if self.image_size <= 128:\n",
    "                resnet_ouput = n_resnet_blocks(conv3, 6) # (?, w/4, h/4, 128)\n",
    "            else:\n",
    "                # Use 9 nesnet blocks for higher-resolution image\n",
    "                resnet_ouput = n_resnet_blocks(conv3, 9) # (?, w/4, h/4, 128)\n",
    "            \n",
    "            # Decoding\n",
    "            \n",
    "            deconv1 = deconv2d(resnet_ouput, self.ngf*2, [3, 3], [2, 2], stddev=0.02, padding='SAME', name='g_deconv1')\n",
    "            # Output shape: (?, w/2, h/2, 64)\n",
    "            deconv2 = deconv2d(deconv1, self.ngf, [3, 3], [2, 2], stddev=0.02, padding='SAME', name='g_deconv2')\n",
    "            # Output shape: (?, w/4, h/4, 32)\n",
    "            padded_deconv3_input = tf.pad(deconv2,[[0, 0], [3, 3], [3, 3], [0, 0]], \"REFLECT\")\n",
    "            # Output shape: (?, w, h, 32)\n",
    "            deconv3 = conv2d(padded_deconv3_input, 3, [first_filter_size, first_filter_size], [1, 1],stddev=0.02,padding=\"VALID\", name=\"g_deconv3\",isApplyRelu=False)\n",
    "            # Output shape: (?, w, h, 3)\n",
    "            \n",
    "            return deconv3\n",
    "            \n",
    "            \n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
